UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild

Authors

  • Khurram Soomro
  • Amir Roshan Zamir
  • Mubarak Shah
Abstract

We introduce UCF101 which is currently the largest dataset of human actions. It consists of 101 action classes, over 13k clips and 27 hours of video data. The database consists of realistic user-uploaded videos containing camera motion and cluttered background. Additionally, we provide baseline action recognition results on this new dataset using standard bag of words approach with overall performance of 43.9%. To the best of our knowledge, UCF101 is currently the most challenging dataset of actions due to its large number of classes, large number of clips and also unconstrained nature of such clips.

Similar resources

Do less and achieve more: Training CNNs for action recognition utilizing action images from the Web

Recently, attempts have been made to collect millions of videos to train CNN models for action recognition in videos. However, curating such large-scale video datasets requires immense human labor, and training CNNs on millions of videos demands huge computational resources. In contrast, collecting action images from the Web is much easier and training on images requires much less computation. ...

Two-Stream convolutional nets for action recognition in untrimmed video

We extend the two-stream convolutional net architecture developed by Simonyan for action recognition in untrimmed video clips. The main challenges of this project are first replicating the results of Simonyan et al, and then extending the pipeline to apply it to much longer video clips in which no actions of interest are taking place most of the time. We explore aspects of the performance of th...

Hand Detection and Tracking in Videos for Fine-Grained Action Recognition

In this paper, we develop an effective method of detecting and tracking hands in uncontrolled videos based on multiple cues including hand shape, skin color, upper body position and flow information. We apply our hand detection results to perform fine-grained human action recognition. We demonstrate that motion features extracted from hand areas can help classify actions even when they look fam...

Two-Stream SR-CNNs for Action Recognition in Videos

Human action is a high-level concept in computer vision research and understanding it may benefit from different semantics, such as human pose, interacting objects, and scene context. In this paper, we explicitly exploit semantic cues with aid of existing human/object detectors for action recognition in videos, and thoroughly study their effect on the recognition performance for different types...

Efficient Action and Event Recognition in Videos Using Extreme Learning Machines

A great deal of research in the computer vision community has gone into action and event recognition. Automatic video understanding of actions is crucial for application areas such as video indexing, surveillance and video summarization. In this thesis, we explore action and event recognition on RGB videos bo...

Journal:
  • CoRR

Volume: abs/1212.0402

Publication date: 2012